feat: fix N+1 problem#505
Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
1e57d5a to
97d04c5
Compare
|
📘 OpenAPI / Swagger preview ➡️ https://OpenStackweb.github.io/summit-api/openapi/pr-505/ This page is automatically updated on each push to this PR. |
|
📘 OpenAPI / Swagger preview ➡️ https://OpenStackweb.github.io/summit-api/openapi/pr-505/ This page is automatically updated on each push to this PR. |
…egorySerializer (typo '.')
…tch-load support - Add $expands parameter to DoctrineRepository::getAllByPage - Add addExpandFetchJoins call (toOne FETCH JOINs before pagination) - Add batchLoadExpandedRelations call (toMany IN-clause batching after hydration) - Add protected static getExpandFieldMap() hook (returns [] by default, subclasses override) - Use static::getExpandFieldMap() for late static binding so subclass maps resolve correctly - Make DoctrineSummitEventRepository::getExpandFieldMap() public static to allow static call from tests - Propagate array $expands = [] signature to all concrete getAllByPage overrides and interfaces
|
📘 OpenAPI / Swagger preview ➡️ https://OpenStackweb.github.io/summit-api/openapi/pr-505/ This page is automatically updated on each push to this PR. |
There was a problem hiding this comment.
Pull request overview
This PR introduces expand-aware eager/batch loading for SummitEvent retrieval (including nested dot-notation expands) to reduce Doctrine N+1 query patterns, and threads the expand request parameter down into repositories so the loader can act on it.
Changes:
- Add generic expand-driven fetch-join (toOne) and batch hydration (toMany) support to
GraphLoaderTrait, including recursive nested expands. - Extend
getAllByPagerepository APIs to accept an optional$expandsarray and propagateexpandfrom the Summit Events API controller/strategies. - Add unit/integration tests validating expand maps and query-count reductions for batch loading and nested expands.
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/Unit/Repositories/SummitEventNestedExpandBatchLoadingTest.php | Integration test verifying nested expands reduce level-2 speaker relation queries. |
| tests/Unit/Repositories/SummitEventExpandMapTest.php | Unit test asserting DoctrineSummitEventRepository expand-name → Doctrine-field mismatch map invariants. |
| tests/Unit/Repositories/SummitEventExpandBatchLoadingTest.php | Integration tests measuring query reduction across expands, paging sizes, and data consistency. |
| tests/Unit/Repositories/GraphLoaderNestedExpandTest.php | Unit test suite for recursive nested expand resolution, overrides, and resolvers in GraphLoaderTrait. |
| app/Repositories/Summit/DoctrineSummitOrderExtraQuestionTypeRepository.php | Updates getAllByPage signature to accept $expands. |
| app/Repositories/Summit/DoctrineSummitEventRepository.php | Adds expand field map + implements expand-driven fetch-joins and batch/nested loading for events. |
| app/Repositories/Summit/DoctrineSummitAttendeeTicketRepository.php | Updates getAllByPage signature to accept $expands. |
| app/Repositories/Summit/DoctrineSponsorExtraQuestionTypeRepository.php | Updates getAllByPage signature to accept $expands. |
| app/Repositories/Summit/DoctrineSpeakerRepository.php | Updates getAllByPage signature to accept $expands. |
| app/Repositories/Summit/DoctrineMemberRepository.php | Updates getAllByPage signature to accept $expands. |
| app/Repositories/ResourceServer/DoctrineEndPointRateLimitByIPRepository.php | Updates getAllByPage signature to accept $expands (method still TODO). |
| app/Repositories/DoctrineRepository.php | Adds GraphLoaderTrait and wires expand-driven fetch-joins + batch/nested loading into the generic getAllByPage. |
| app/Models/Utils/IBaseRepository.php | Extends repository interface getAllByPage signature to include $expands. |
| app/Models/Foundation/Summit/Repositories/ISummitEventRepository.php | Extends summit event repository interface getAllByPage signature to include $expands. |
| app/Models/Foundation/Main/Repositories/IMemberRepository.php | Extends member repository interface getAllByPage signature to include $expands. |
| app/libs/Utils/Doctrine/GraphLoaderTrait.php | Core implementation for expand fetch-joins, batch hydration, and recursive nested expand loading. |
| app/Http/Controllers/Apis/Protected/Summit/Strategies/events/RetrieveSummitEventsStrategy.php | Parses expand string into an array and passes it into the strategy source retrieval. |
| app/Http/Controllers/Apis/Protected/Summit/Strategies/events/RetrieveAllSummitEventsStrategy.php | Propagates $expands into the repository call. |
| app/Http/Controllers/Apis/Protected/Summit/Strategies/events/RetrieveAllSummitEventsBySummitStrategy.php | Propagates $expands into the repository call. |
| app/Http/Controllers/Apis/Protected/Summit/OAuth2SummitEventsApiController.php | Ensures expand is passed to strategies (so repositories can batch-load accordingly) and reused for serialization. |
Comments suppressed due to low confidence (1)
tests/Unit/Repositories/SummitEventExpandBatchLoadingTest.php:356
- runFullScenario() also attaches a SQL logger and detaches it only on the happy path. If the repository call or relation touching throws, the logger remains attached and can affect subsequent tests. Use try/finally to guarantee detachQueryLogger() runs.
private function runFullScenario(PagingInfo $paging, Filter $filter, array $expands): array
{
$this->clearIdentityMap();
$debug = $this->attachQueryLogger();
$response = $this->repository->getAllByPage($paging, $filter, null, $expands);
$items = $response->getItems();
$this->touchAllRelations($items);
$count = count($debug->queries);
$categories = $this->categorizeQueries($debug);
$this->detachQueryLogger();
return ['count' => $count, 'categories' => $categories, 'items' => $items];
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| $qb->shouldReceive('from')->andReturnUsing(function ($class) use ($qb, &$currentFrom) { | ||
| $currentFrom = $class; | ||
| return $qb; | ||
| }); | ||
|
|
||
| $qb->shouldReceive('leftJoin')->andReturnUsing(function ($join) use ($qb, &$currentJoins) { | ||
| $currentJoins[] = $join; | ||
| return $qb; | ||
| }); |
| // Determine child entity class and metadata | ||
| $firstChild = reset($childEntities); | ||
| $childClass = get_class($firstChild); | ||
|
|
||
| try { | ||
| $childMeta = $em->getClassMetadata($childClass); | ||
| } catch (\Exception $ex) { |
| @@ -155,54 +241,475 @@ protected function hydrateCollectionWithNestedToOne( | |||
| } | |||
| $this->clearIdentityMap(); | ||
| $debug = $this->attachQueryLogger(); | ||
|
|
||
| $response = $this->repository->getAllByPage($paging, $filter, null, $expands); | ||
| $items = $response->getItems(); | ||
| $this->touchRelations($items, $touchRelations); | ||
|
|
||
| $count = count($debug->queries); | ||
| $categories = $this->categorizeQueries($debug); | ||
| $this->detachQueryLogger(); | ||
|
|
| $this->clearIdentityMap(); | ||
| $debug = $this->attachQueryLogger(); | ||
|
|
||
| $response = $this->repository->getAllByPage($paging, $filter, null, $expands); | ||
| $items = $response->getItems(); | ||
| $this->touchSpeakerRelations($items, $touchNested); | ||
|
|
||
| $count = count($debug->queries); | ||
| $this->detachQueryLogger(); | ||
|
|
||
| return ['count' => $count, 'items' => $items]; |
…ase timing in getAllByPage Adds ms-precision breakdown: count query, id query, hydration query, batchLoad. Was using integer time() which masks sub-second differences.
|
📘 OpenAPI / Swagger preview ➡️ https://OpenStackweb.github.io/summit-api/openapi/pr-505/ This page is automatically updated on each push to this PR. |
…tAllByPage SummitEventType uses JOINED inheritance with a ClassName discriminator so Doctrine already hydrates the correct PresentationType subclass automatically via innerJoin e.type. The explicit leftJoin(PresentationType::class, et2) in the hydration query was redundant and caused Doctrine to emit PresentationType as a separate root entity in getResult(), silently dropping those events from the byId map and logging unexpected hydration row warnings. The et2 join is kept in getFastCount and getAllIdsByPage where it is needed for type_allows_attendee_vote and type_allows_custom_ordering filter predicates.
|
📘 OpenAPI / Swagger preview ➡️ https://OpenStackweb.github.io/summit-api/openapi/pr-505/ This page is automatically updated on each push to this PR. |
… cache in getEvents Three causes of ~1100ms serializer overhead: 1. Presentation::getSpeakers() used matching() on an EXTRA_LAZY collection which always issues a DB query regardless of whether batchLoadExpandedRelations had already pre-loaded the assignments. Replaced with toArray() + PHP usort so that an already-initialized collection (batch-loaded) is sorted in memory with zero DB round-trips. Falls back to a single full-load query on cold collections. 2. getSpeaker() on each PresentationSpeakerAssignment triggered one lazy-load per speaker because batchLoadExpandedRelations only pre-fetched the assignments, not the speaker FK. Injecting speakers.speaker into the batch expands causes batchLoadNestedExpands to load all unique PresentationSpeaker entities in one IN-clause query, so getSpeaker() returns the already-initialized identity-map entry. 3. getEvents() did not pass use_cache=true to the serializer params, so PresentationSerializer skipped its Redis cache entirely. Adding it means warm requests serialize from cache (Redis GET + JSON decode) instead of re-running the full field-mapping and expand loop per presentation.
|
📘 OpenAPI / Swagger preview ➡️ https://OpenStackweb.github.io/summit-api/openapi/pr-505/ This page is automatically updated on each push to this PR. |
…for automatic invalidation Without this, a presentation update left stale cached data for up to 20 minutes (CacheTTL=1200s). Including the last_edited Unix timestamp in the key means any update to the presentation changes the key, so the old entry is silently ignored and ages out via TTL — no explicit Cache::forget needed anywhere.
|
📘 OpenAPI / Swagger preview ➡️ https://OpenStackweb.github.io/summit-api/openapi/pr-505/ This page is automatically updated on each push to this PR. |
…nse::toArray Adds: - OAuth2SummitEventsApiController::getEvents: logs total serializer ms after toArray() - PagingResponse::toArray: logs per-item entity id + ms to identify slow outliers Remove before merging to main.
|
📘 OpenAPI / Swagger preview ➡️ https://OpenStackweb.github.io/summit-api/openapi/pr-505/ This page is automatically updated on each push to this PR. |
… speaker Member Two remaining N+1 sources on the cold-cache path: 1. GraphLoaderTrait::batchHydrateCollections ran the fetch-join DQL correctly but Doctrine's ObjectHydrator does not always mark EXTRA_LAZY PersistentCollections as initialized. Subsequent iteration (e.g. getSponsors(), getTags()) therefore still fired one SELECT per entity per collection. After getResult() returns, explicitly call setInitialized(true) on each root entity's collection via ClassMetadata::reflFields — valid because the DQL loaded ALL elements for these root IDs into the UnitOfWork. 2. PresentationSpeaker::getFirstName()/getLastName() fall back to $this->member->getFirstName() when the speaker's own field is empty, triggering one Member lazy-load per speaker. Outlier events (7570: 462ms, 7573: 322ms) had many such speakers. Adding speakers.speaker.member to the internal batch expands causes batchLoadNestedExpands to load all Member entities in one IN-clause query.
|
📘 OpenAPI / Swagger preview ➡️ https://OpenStackweb.github.io/summit-api/openapi/pr-505/ This page is automatically updated on each push to this PR. |
…Y force-init tryGetById() was returning false for all entities (API signature mismatch with Doctrine version in use), so setInitialized(true) was never called and the EXTRA_LAZY collections remained uninitialized — iteration still fired per-entity DB queries on every request for non-Presentation events (sponsors, tags). em->find() with an entity already in the identity map (guaranteed after the main hydration query runs) costs zero DB queries and correctly returns the entity so we can force-initialize its PersistentCollection.
|
📘 OpenAPI / Swagger preview ➡️ https://OpenStackweb.github.io/summit-api/openapi/pr-505/ This page is automatically updated on each push to this PR. |
…force-init
Previous attempt passed ClassMetadata as second arg (must be FQCN string)
and em->find() for JOINED-inheritance entities was issuing SELECT queries
even for in-identity-map entities, making times worse.
Correct call: tryGetById(scalar_id, meta->rootEntityName)
- meta->rootEntityName is the root class FQCN string for the hierarchy
- Doctrine hashes identity map keys as implode(' ', (array)$id) so passing
the scalar id directly is correct
- Zero DB queries: reads only the UnitOfWork identity map
|
📘 OpenAPI / Swagger preview ➡️ https://OpenStackweb.github.io/summit-api/openapi/pr-505/ This page is automatically updated on each push to this PR. |
…query The generic nested-expand recursion (speakers.speaker / speakers.speaker.member) could not batch-load the speaker FK because the speakers resolver ($assignment->getSpeaker()) is invoked during the iteration step to gather child entities — that step itself fires one EXTRA_LAZY lazy-load per assignment before the batch-load can run. The recursion then never finds a 'speaker' field on PresentationSpeaker and silently no-ops. Replace the broken nested batch with a single direct DQL: SELECT s, m FROM PresentationSpeakerAssignment a JOIN a.speaker s LEFT JOIN s.member m WHERE a.presentation IN (:ids) This loads ALL PresentationSpeaker entities AND their Member entities for the requested presentations in one query, regardless of how many speakers each has. After this preload runs: - $assignment->getSpeaker() hits the identity map (zero queries) - $speaker->getMember() returns an already-initialized entity - PresentationSpeaker::getFirstName()/getLastName() fallback to $this->member->getFirstName() costs nothing Removed the now-useless speakers.speaker + speakers.speaker.member additions to $batchExpands since they only contributed wasted N lazy-loads via the resolver.
|
📘 OpenAPI / Swagger preview ➡️ https://OpenStackweb.github.io/summit-api/openapi/pr-505/ This page is automatically updated on each push to this PR. |
Non-Presentation events (SummitEvent, SummitGroupEvent, SummitEventWithFile) had no caching at all and went through full serialization (~60-170ms per event) on every request. Mirror the PresentationSerializer cache pattern: - Cache key includes static::class so each subclass (SummitGroupEventSerializer, SummitEventWithFileSerializer) gets its own namespace and they don't collide - last_edited timestamp in the key gives automatic invalidation on entity update without needing explicit Cache::forget anywhere - Skipped when $this instanceof PresentationSerializer — Presentations already have their own, more specific cache at the presentation level, and double caching would just waste Redis space - 1200s TTL matches PresentationSerializer Expected: SummitGroupEvent and plain-SummitEvent rows drop from 60-170ms to ~6-12ms on warm cache (matching Presentation cache behavior).
|
📘 OpenAPI / Swagger preview ➡️ https://OpenStackweb.github.io/summit-api/openapi/pr-505/ This page is automatically updated on each push to this PR. |
…skip re-serialization on hit The cache-hit path was re-serializing the full media_uploads collection every time because getMediaUploadsSerializerType() depends on the current user role and the cached value could have been written by a request with a different role. Trace 885fa70613349e23d4f6719e36b8a513 from dev showed all 10 events as cache HITS yet still totaled 1274ms — the re-serialization of media_uploads (10+ items per event with sub-serializer calls and possible lazy-loads) was the remaining cost. Fix: include getMediaUploadsSerializerType() in the cache key. Now each user role gets its own cache entry containing fully-serialized media_uploads, and the cache-hit path just decodes JSON and returns — no per-hit re-serialization.
|
📘 OpenAPI / Swagger preview ➡️ https://OpenStackweb.github.io/summit-api/openapi/pr-505/ This page is automatically updated on each push to this PR. |
The diagnostic logging added for finding the cache bottleneck was itself adding significant overhead per request (15-30 extra Log::debug writes for a 10-item page, each with JSON-encoded context). On a verbose-logging dev server this could easily add 50-200ms of pure logging overhead per request. Removed: - PagingResponse::toArray per-item id/ms log - OAuth2SummitEventsApiController::getEvents serializer ms log - DoctrineSummitEventRepository::getAllByPage per-phase logs (count, ids, hydration, batchLoad, speakerMemberPreload); kept just the final total ms log - PresentationSerializer cache-hit log (fires on every cache hit) - SummitEventSerializer cache-hit log (fires on every cache hit) Behaviour is unchanged; only the logging is removed.
|
📘 OpenAPI / Swagger preview ➡️ https://OpenStackweb.github.io/summit-api/openapi/pr-505/ This page is automatically updated on each push to this PR. |
…t for cache lookup Previously each cache hit fired two Redis round-trips (EXISTS then GET). On a network where Redis RTT is ~5ms each, that's ~10ms per cache lookup. For a 10-item page where every event is a cache hit, that's ~100ms of pure Redis network overhead. Replace with a single Cache::get and a null check — Redis returns null when the key does not exist, so we get the same semantics in one round-trip.
|
📘 OpenAPI / Swagger preview ➡️ https://OpenStackweb.github.io/summit-api/openapi/pr-505/ This page is automatically updated on each push to this PR. |
No description provided.